Managing Uncertain Data: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Author
Abstract
The ubiquity of uncertain data in modern-day applications (such as information extraction, data integration, sensor and RFID networks, and scientific experiments) has resulted in a growing need for techniques to deal with such data. This thesis addresses challenges in managing uncertain data in a principled, usable, and scalable fashion. We identify and explore a fundamental tension between usability and expressiveness in models for representing uncertain data. We propose a space of models for representing uncertain data, place the models in an expressiveness hierarchy, and study how the models relate to each other in terms of closure properties. We also address important problems of uniqueness testing, equivalence checking, minimization, and approximation in our space of models. For a representative model in our space (called URM), we study database design theory: We provide a sound and complete axiomatization of functional dependencies (FDs) for URM data, describe lossless decompositions, and give algorithms and complexity results for testing, finding, and inferring FDs. To address the usability-expressiveness tradeoff, we show that by adding lineage (provenance) to the URM model, we obtain a complete (intuitively, a fully expressive) data model, which we call the Uncertainty-Lineage Database (ULDB) model. We study properties of ULDBs including membership, extraction, and minimization. We develop techniques for query processing over ULDBs and show that lineage can be exploited for efficient confidence computation in ULDBs. Then, we present an extension to ULDBs that allows a seamless incorporation of data modifications and a lightweight versioning capability. Finally, we look at uncertain data management in the context of data integration. Data integration systems offer a uniform interface to a set of data sources. Despite recent progress, setting up and maintaining a data integration application still requires significant up-front
Similar resources
Incorporating Uncertainty in Data Management and Integration: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Gaze-enhanced User Interface Design: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Structuring Peer Interactions for Massive Scale Learning: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Simulation-based Search for Hybrid System Control and Analysis: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Haptics and Physical Simulation for Virtual Bone Surgery: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Publication date: 2009